Course title |
Introduction to Information Retrieval and Text Mining (資訊檢索與文字探勘導論) |
Semester |
102-1 |
Offered to |
College of Management, Department of Information Management |
Instructor |
陳建錦 |
Course number |
IM5030 |
Course identifier |
725EU3410 |
Section |
|
Credits |
3 |
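Full/half year |
Half year |
Required/elective |
Elective |
Class time |
Tuesday, periods 2, 3, 4 (9:10~12:10) |
Location |
Management Building 1, Room 204 (管一204) |
Remarks |
This course is taught in English. Restricted to third-year undergraduates and above. Enrollment limit: 25 students. |
Ceiba course website |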
http://ceiba.ntu.edu.tw/1021IRTM |
Course introduction video |
|
Core competency mapping |
Chart relating core competencies to course planning |
Course syllabus
|
Course description |
This course covers the concepts and algorithms of information retrieval and text mining. Theoretical topics include term extraction, term weighting, the vector space model, the binary independence model, language models, IR system evaluation, Naive Bayes classification, Rocchio classification, kNN, k-means, HAC, PageRank, and HITS. Programming assignments and a term project help students understand how an IR system is developed. |
Course objectives |
The course is aimed at graduate students and senior undergraduate students who are interested in information retrieval and text mining. The first part of the course covers the basics of information retrieval. Research topics, such as text classification and clustering, are then discussed to provide a comprehensive study of information retrieval and text mining. |
Course requirements |
Prior knowledge of a programming language, data structures, and probability. |
Expected weekly study hours after class |
|
Office Hours |
Every Thursday, 11:00~12:00 |
Required reading |
|
References |
Christopher D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, The MIT Press, 1999.
William B. Frakes and Ricardo Baeza-Yates, Information Retrieval: Data Structures and Algorithms, Prentice Hall, 1992.
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, 1999.
|
Grading (for reference only) |
No. | Item | Percentage | Notes |
1. | Midterm exam | 25% | |
2. | Programming assignments | 25% | (about 4 assignments) |
3. | Term Project | 25% | |
4. | Final exam | 25% | |
|
Week | Date | Topic |
Week 1 | 9/10 | Syllabus; Term Vocabulary |
Week 2 | 9/17 | Term Vocabulary; PAT Tree and Chinese Keyword Extraction; ** Programming Assignment 1 |
Week 3 | 9/24 | Scoring, Term Weighting and the Vector Space Model |
Week 4 | 10/1 | Scoring, Term Weighting and the Vector Space Model; Evaluation in Information Retrieval; ** Programming Assignment 2 |
Week 5 | 10/8 | Evaluation in Information Retrieval; Relevance Feedback, Query Expansion, and Web Personalization |
Week 6 | 10/15 | Relevance Feedback, Query Expansion, and Web Personalization; Probabilistic Information Retrieval |
Week 7 | 10/22 | Probabilistic Information Retrieval; Language Models for Information Retrieval |
Week 8 | 10/29 | Language Models for Information Retrieval; Link Analysis |
Week 9 | 11/5 | Midterm |
Week 10 | 11/12 | Link Analysis; Text Classification and Naive Bayes |
Week 11 | 11/19 | Text Classification and Naive Bayes; ** Programming Assignment 3 |
Week 12 | 11/26 | No class (conference leave) |
Week 13 | 12/3 | Vector Space Classification |
Week 14 | 12/10 | Vector Space Classification; Hierarchical Clustering |
Week 15 | 12/17 | Hierarchical Clustering; Topic Detection and Incremental Clustering; ** Programming Assignment 4 |
Week 16 | 12/24 | Topic Detection and Incremental Clustering; Flat Clustering |
Week 17 | 12/31 | Flat Clustering |
Week 18 | 1/7 | Final |
Week 19 | 1/14 | IRTM Workshop |
|